The Better Alignment Among Output Alignments
Authors
Abstract
Sequence alignment is one of the most fundamental techniques in modern molecular biology. Computer scientists have mapped it onto the longest common subsequence (LCS) problem, which is well studied in algorithms. The optimal LCS alignment score can be found in O(n1 n2) time by dynamic programming, where n1 and n2 are the lengths of the two sequences. Many scoring functions have been proposed to measure the quality of alignments under different criteria, such as affine gap penalties and score matrices like PAM, BLOSUM, and Gonnet. All of these scoring functions share the same core, dynamic programming. Once the optimal alignment score is found, tracing back through the alignment lattice produced during the dynamic programming yields an alignment with the optimal score. Unfortunately, this optimal alignment is usually not unique, and the biologically meaningful alignment may not be an optimal one. In this paper, we present new definitions for measuring the best alignment among the optimal alignments and describe algorithms that solve the resulting problems. The proposed algorithms produce alignments that not only attain the optimal score but are also more biologically meaningful, without increasing the computational complexity of the original algorithm. In addition, the new definitions can be used to find a better template when predicting 3D protein structures. These problems are also interesting in their own right, apart from their practical use.
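As an illustration of the background the abstract relies on, the following is a minimal sketch of the textbook LCS dynamic program, a single traceback, and an enumeration of all distinct optimal subsequences. It is not the algorithm proposed in the paper; the function names `lcs_table`, `traceback`, and `all_lcs` and the example sequences are assumptions chosen for illustration.

```python
from functools import lru_cache


def lcs_table(a, b):
    """Fill the (n1+1) x (n2+1) LCS score lattice in O(n1 * n2) time."""
    n1, n2 = len(a), len(b)
    score = [[0] * (n2 + 1) for _ in range(n1 + 1)]
    for i in range(1, n1 + 1):
        for j in range(1, n2 + 1):
            if a[i - 1] == b[j - 1]:
                score[i][j] = score[i - 1][j - 1] + 1
            else:
                score[i][j] = max(score[i - 1][j], score[i][j - 1])
    return score


def traceback(a, b, score):
    """Walk back through the lattice and return ONE optimal subsequence.

    Ties are broken arbitrarily (here: prefer moving up), which is why
    the result is only one of possibly many optimal answers.
    """
    i, j = len(a), len(b)
    out = []
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif score[i - 1][j] >= score[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


def all_lcs(a, b, score):
    """Enumerate every distinct optimal subsequence (may be exponential)."""

    @lru_cache(maxsize=None)
    def walk(i, j):
        if i == 0 or j == 0:
            return frozenset({""})
        if a[i - 1] == b[j - 1]:
            return frozenset(s + a[i - 1] for s in walk(i - 1, j - 1))
        found = set()
        # Follow every branch that preserves the optimal score.
        if score[i - 1][j] >= score[i][j - 1]:
            found |= walk(i - 1, j)
        if score[i][j - 1] >= score[i - 1][j]:
            found |= walk(i, j - 1)
        return frozenset(found)

    return sorted(walk(len(a), len(b)))


if __name__ == "__main__":
    a, b = "AGCAT", "GAC"  # illustrative sequences, not from the paper
    table = lcs_table(a, b)
    print(table[-1][-1])          # optimal score
    print(traceback(a, b, table)) # one optimal answer
    print(all_lcs(a, b, table))   # all optimal answers
```

For the sequences AGCAT and GAC, the optimal score is 2 and three different subsequences attain it, which is the non-uniqueness the abstract refers to. A criterion for choosing among such ties is what the paper sets out to define.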
Similar resources
An Investigation of the Sampling-Based Alignment Method and Its Contributions
By investigating the distribution of phrase pairs in phrase translation tables, the work in this paper describes an approach to increase the number of n-gram alignments in phrase translation tables output by a sampling-based alignment method. This approach consists in enforcing the alignment of n-grams in distinct translation subtables so as to increase the number of n-grams. Standard normal di...
Refinement by shifting secondary structure elements improves sequence alignments.
Constructing a model of a query protein based on its alignment to a homolog with experimentally determined spatial structure (the template) is still the most reliable approach to structure prediction. Alignment errors are the main bottleneck for homology modeling when the query is distantly related to the template. Alignment methods often misalign secondary structural elements by a few residues...
Multiple molecular sequence alignment by island parallel genetic algorithm
This paper presents an evolution-based approach to multiple molecular sequence alignment. The approach is based on the island parallel genetic algorithm that relies on the fitness distribution over the population of alignments. The algorithm searches for an alignment among the independent isolated evolving populations by optimizing a weighted sum-of-pairs objective function which measure...
Improvement of phylogenies after removing divergent and ambiguously aligned blocks from protein sequence alignments.
Alignment quality may have as much impact on phylogenetic reconstruction as the phylogenetic methods used. Not only the alignment algorithm, but also the method used to deal with the most problematic alignment regions, may have a critical effect on the final tree. Although some authors remove such problematic regions, either manually or using automatic methods, in order to improve phylogenetic ...
Using Traveling Salesman Problem Algorithms to Determine Multiple Sequence Alignment Orders
Multiple Sequence Alignment (MSA) is one of the most important tools in modern biology. The MSA problem is NP-hard, therefore, heuristic approaches are needed to align a large set of data within a reasonable time. Among existing heuristic approaches, CLUSTALW has been found to be the progressive alignment program that provides the best quality alignments, while the program POA provides very fas...